Loghub: A Large Collection of System Log Datasets towards Automated Log Analytics
Logs have been widely adopted in software system development and maintenance
because of the rich system runtime information they contain. In recent years,
growing software size and complexity have led to rapid growth in the
volume of logs. To handle these large volumes of logs efficiently and
effectively, a line of research focuses on intelligent log analytics powered by
AI (artificial intelligence) techniques. However, only a small fraction of
these techniques have reached successful deployment in industry because of the
lack of public log datasets and necessary benchmarking upon them. To fill this
significant gap between academia and industry and also facilitate more research
on AI-powered log analytics, we have collected and organized loghub, a large
collection of log datasets. In particular, loghub provides 17 real-world log
datasets collected from a wide range of systems, including distributed systems,
supercomputers, operating systems, mobile systems, server applications, and
standalone software. In this paper, we summarize the statistics of these
datasets, introduce some practical log usage scenarios, and present a case
study on anomaly detection to demonstrate how loghub facilitates the research
and practice in this field. At the time of writing, loghub
datasets have been downloaded over 15,000 times by more than 380 organizations
from both industry and academia.
Comment: Dataset available at https://zenodo.org/record/322717
Retromorphic Testing: A New Approach to the Test Oracle Problem
A test oracle serves as a criterion or mechanism to assess the correspondence
between software output and the anticipated behavior for a given input set. In
automated testing, black-box techniques, known for their non-intrusive nature
in test oracle construction, are widely used, including notable methodologies
like differential testing and metamorphic testing. Inspired by the mathematical
concept of inverse function, we present Retromorphic Testing, a novel black-box
testing methodology. It leverages an auxiliary program in conjunction with the
program under test, which establishes a dual-program structure consisting of a
forward program and a backward program. The input data is first processed by
the forward program and then its program output is reversed to its original
input format using the backward program. In particular, the auxiliary program
can operate as either the forward or backward program, leading to different
testing modes. The process concludes by examining the relationship between the
initial input and the transformed output within the input domain. For example,
to test the implementation of the sine function $\sin(x)$, we can employ its
inverse function, $\arcsin(x)$, and validate the equation $\sin(\arcsin(x)) = x$. In addition to the
high-level concept of Retromorphic Testing, this paper presents its three
testing modes with illustrative use cases across diverse programs, including
algorithms, traditional software, and AI applications.
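The sine example in the abstract above can be sketched in a few lines of Python. This is only an illustration of the forward/backward round trip, not the paper's implementation: the helper name `retromorphic_check`, the tolerance, the input grid, and the deliberately buggy `truncated_sin` are all assumptions introduced here. In this sketch the auxiliary program `math.asin` plays the forward role and the program under test plays the backward role.

```python
import math

def retromorphic_check(forward, backward, inputs, tol=1e-9):
    """Return (input, round_trip) pairs where backward(forward(x)) drifts from x.

    Hypothetical helper: the name and the absolute tolerance are assumptions.
    """
    failures = []
    for x in inputs:
        round_trip = backward(forward(x))
        if not math.isclose(round_trip, x, rel_tol=0.0, abs_tol=tol):
            failures.append((x, round_trip))
    return failures

# Inputs restricted to [-1, 1], the domain on which asin is defined.
inputs = [i / 10 for i in range(-10, 11)]

# Program under test: math.sin, used as the backward program.
failures = retromorphic_check(math.asin, math.sin, inputs)

# A deliberately faulty sine (Taylor series cut after two terms), for contrast.
def truncated_sin(y):
    return y - y**3 / 6

buggy_failures = retromorphic_check(math.asin, truncated_sin, inputs)

print("math.sin violations:", len(failures))
print("truncated_sin violations:", len(buggy_failures))
```

No reference output for the sine function is needed: the check only compares each input with its own round trip, which is what makes the approach a test oracle.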
A Review of Modeling and Diagnostic Techniques for Eccentricity Fault in Electric Machines
Research on the modeling and fault diagnosis of rotor eccentricities has been conducted over the past two decades. A variety of diagnostic theories and methods have been proposed based on different mechanisms, and existing reviews cover either one type of electric machine or one type of eccentricity. Nonetheless, the research routes of modeling and diagnosis are common, regardless of machine or eccentricity type. This article aims to review all the possible modeling and diagnostic approaches for all common types of electric machines with eccentricities and to provide suggestions for a future research roadmap. The paper indicates that the trend is toward a reliable, low-cost, non-intrusive, real-time, online, visualized diagnostic method. Observer-based diagnostic strategies are considered promising for continued research.
ROME: Testing Image Captioning Systems via Recursive Object Melting
Image captioning (IC) systems aim to generate a text description of the
salient objects in an image. In recent years, IC systems have been increasingly
integrated into our daily lives, such as assistance for visually impaired
people and description generation in Microsoft PowerPoint. However, even the
cutting-edge IC systems (e.g., Microsoft Azure Cognitive Services) and
algorithms (e.g., OFA) could produce erroneous captions, leading to incorrect
captioning of important objects, misunderstanding, and threats to personal
safety. The existing testing approaches either fail to handle the complex form
of IC system output (i.e., sentences in natural language) or generate unnatural
images as test cases. To address these problems, we introduce Recursive Object
MElting (Rome), a novel metamorphic testing approach for validating IC systems.
Unlike existing approaches that generate test cases by inserting
objects, which easily makes the generated images unnatural, Rome melts (i.e.,
removes and inpaints) objects. Rome assumes that the object set in the caption of
an image includes the object set in the caption of a generated image after
object melting. Given an image, Rome can recursively remove its objects to
generate different pairs of images. We use Rome to test one widely-adopted
image captioning API and four state-of-the-art (SOTA) algorithms. The results
show that the test cases generated by Rome look much more natural than those of the SOTA
IC testing approach and achieve comparable naturalness to the original
images. Meanwhile, by generating test pairs using 226 seed images, Rome reports
a total of 9,121 erroneous issues with high precision (86.47%-92.17%). In
addition, we use the test cases generated by Rome to retrain
Oscar, which improves its performance across multiple evaluation metrics.
Comment: Accepted by ISSTA 202
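The metamorphic relation stated in the abstract above (the object set in the caption of an image should include the object set in the caption of the image after melting) can be sketched as a set-inclusion check. This is a minimal illustration under stated assumptions, not Rome's implementation: the function names are hypothetical, the object sets are hand-written stand-ins for whatever a captioning system plus an object extractor would produce, and pairing every earlier image in a melting chain with every later one is an assumption about how pairs are formed.

```python
from itertools import combinations

def unexpected_objects(ancestor_objects, descendant_objects):
    """Objects named in the melted image's caption but absent from the ancestor's."""
    return set(descendant_objects) - set(ancestor_objects)

def check_melting_chain(caption_objects_per_image):
    """Check the inclusion relation over one chain of recursively melted images.

    caption_objects_per_image[i] is the (hypothetical) object set extracted from
    the caption of the image after i objects have been melted.
    Returns (ancestor_index, descendant_index, extra_objects) for each violation.
    """
    issues = []
    indexed = enumerate(caption_objects_per_image)
    for (i, earlier), (j, later) in combinations(indexed, 2):
        extra = unexpected_objects(earlier, later)
        if extra:
            issues.append((i, j, extra))
    return issues

# Made-up caption object sets for a seed image and two melted descendants.
chain = [
    {"dog", "frisbee", "grass"},  # seed image
    {"dog", "grass"},             # frisbee melted: consistent with the seed
    {"cat", "grass"},             # dog melted, but the caption now names a cat
]

issues = check_melting_chain(chain)
for ancestor, descendant, extra in issues:
    print(f"image {descendant} vs image {ancestor}: unexpected {sorted(extra)}")
```

A violation flags the pair as suspicious without saying which of the two captions is wrong, so reported issues still need inspection.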